Adaptive Submodular Maximization in Bandit Setting

Authors

  • Victor Gabillon
  • Branislav Kveton
  • Zheng Wen
  • Brian Eriksson
  • S. Muthukrishnan
Abstract

Maximization of submodular functions has wide applications in machine learning and artificial intelligence. Adaptive submodular maximization has traditionally been studied under the assumption that the model of the world (that is, the expected gain of choosing an item given the previously selected items and their states) is known. In this paper, we study the setting where the expected gain is initially unknown and is learned by interacting repeatedly with the optimized function. We propose an efficient algorithm for solving our problem and prove that its expected cumulative regret increases logarithmically with time. Our regret bound captures an inherent property of submodular maximization: earlier mistakes are more costly than later ones. We refer to our approach as Optimistic Adaptive Submodular Maximization (OASM) because it trades off exploration and exploitation based on the principle of optimism in the face of uncertainty. We evaluate our method on a preference elicitation problem and show that non-trivial K-step policies can be learned from just a few hundred interactions with the problem.
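To make the idea of optimism in the face of uncertainty concrete, the following is a minimal sketch and not the paper's exact OASM algorithm. It assumes a toy stochastic coverage problem in which each item has an unknown Bernoulli success probability and, when it succeeds, covers a fixed set of elements. The names `ucb`, `run_episode`, `simulate`, `covers`, and `true_p`, as well as the particular confidence radius, are illustrative assumptions that do not come from the source.

```python
import math
import random


def ucb(successes, pulls, t):
    """Optimistic estimate of a Bernoulli mean (assumed UCB1-style radius)."""
    if pulls == 0:
        return 1.0  # never-observed items are treated as maximally promising
    mean = successes / pulls
    radius = math.sqrt(2.0 * math.log(max(t, 2)) / pulls)
    return min(1.0, mean + radius)


def run_episode(covers, true_p, successes, pulls, K, t, rng):
    """Greedily choose K distinct items by optimistic expected marginal coverage.

    Assumes K <= len(covers). Updates `successes` and `pulls` in place.
    """
    covered = set()
    chosen = []
    for _ in range(K):
        best, best_score = None, -1.0
        for i in range(len(covers)):
            if i in chosen:
                continue
            # Optimistic gain: upper confidence bound times newly covered elements.
            score = ucb(successes[i], pulls[i], t) * len(covers[i] - covered)
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
        # Observe the state of the chosen item (hidden from the learner until now).
        state = 1 if rng.random() < true_p[best] else 0
        pulls[best] += 1
        successes[best] += state
        if state:
            covered |= covers[best]
    return chosen, len(covered)


def simulate(covers, true_p, K, episodes, seed=0):
    """Run repeated K-step episodes, sharing statistics across episodes."""
    rng = random.Random(seed)
    successes = [0] * len(covers)
    pulls = [0] * len(covers)
    rewards = []
    for t in range(1, episodes + 1):
        _, reward = run_episode(covers, true_p, successes, pulls, K, t, rng)
        rewards.append(reward)
    return rewards, pulls
```

In this sketch, a call such as `simulate([{0, 1, 2}, {2, 3}, {4}, {0}], [0.9, 0.5, 0.2, 0.1], K=2, episodes=300)` returns the per-episode coverage and the number of times each item was tried. Because the confidence radius shrinks as an item is observed more often, rarely tried items keep a high index and are explored early, while later episodes mostly exploit the items with the best estimated gain.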


Similar resources

Diffusion Independent Semi-Bandit Influence Maximization

We consider influence maximization (IM) in social networks, which is the problem of maximizing the number of users that become aware of a product by selecting a set of “seed” users to expose the product to. While prior work assumes a known model of information diffusion, we propose a parametrization in terms of pairwise reachability which makes our framework agnostic to the underlying diffusion...

Online Submodular Set Cover, Ranking, and Repeated Active Learning

We propose an online prediction version of submodular set cover with connections to ranking and repeated active learning. In each round, the learning algorithm chooses a sequence of items. The algorithm then receives a monotone submodular function and suffers loss equal to the cover time of the function: the number of items needed, when items are selected in order of the chosen sequence, to ach...

Deterministic & Adaptive Non-Submodular Maximization via the Primal Curvature

While greedy algorithms have long been observed to perform well on a wide variety of problems, up to now approximation ratios have only been known for their application to problems having submodular objective functions f. Since many practical problems have non-submodular f, there is a critical need to devise new techniques to bound the performance of greedy algorithms in the case of non-submo...

Stochastic Submodular Maximization

We study the stochastic submodular maximization problem with respect to a cardinality constraint. Our model can capture the effect of uncertainty in different problems, such as cascade effects in social networks, capital budgeting, sensor placement, etc. We study non-adaptive and adaptive policies and give optimal constant approximation algorithms for both cases. We also bound the adaptivity gap of...

Non-Monotone Adaptive Submodular Maximization

A wide range of AI problems, such as sensor placement, active learning, and network influence maximization, require sequentially selecting elements from a large set with the goal of optimizing the utility of the selected subset. Moreover, each element that is picked may provide stochastic feedback, which can be used to make smarter decisions about future selections. Finding efficient policies f...



Publication date: 2013